SELF-CONTAINED QUEUES WITH
ASSOCIATED CONTROL INFORMATION 
FOR RECEIPT AND TRANSFER OF 
INCOMING AND OUTGOING DATA USING A 
QUEUED DIRECT INPUT-OUTPUT DEVICE 

CROSS REFERENCE TO RELATED 
APPLICATIONS 

This application is related to the following copending
U.S. patent applications Ser. Nos. 09/253,246, 09/253,250,
09/253,247, 09/252,712, 09/252,552, 09/252,728, 09/252,730,
09/253,101, 09/253,286, 09/252,542, 09/253,249, 09/252,556,
09/253,993, 09/253,658, 09/252,555, 09/255,641, 09/255,640
and 09/252,727.

FIELD OF INVENTION 

The subject of the present invention in general pertains to 
a new Input-Output facility design that exploits high band- 
width integrated network adapters. 

BACKGROUND OF THE INVENTION 

In a network computing environment, multitudes of commands
and requests for retrieval and storage of data are processed
every second. To properly address the complexity of routing
these commands and requests, environments with servers have
traditionally offered integrated network connectivity to allow
direct attachments of clients such as Local Area Networks
(LANs). Given the size of most servers, the number of clients
usually is in the range of hundreds to thousands and the
bandwidth required is in the 10-100 Mbits/sec range. However,
in recent years the servers have grown and the amount of data
they are required to handle has grown with them. As a result,
the existing I/O architectures need to be modified to support
this order of magnitude increase in the bandwidth.

In addition, new Internet applications have increased the 
demand for improved latency. The adapters must support a 
larger number of users and connections to consolidate the 
network interfaces which are visible externally. The combi- 
nation of all the above requirements presents a unique 
challenge to server I/O subsystems. 

Furthermore, in large environments such as International 
Business Machines Enterprise System Architecture/390 
(Enterprise System Architecture/390 is a registered trade- 
mark of International Business Machines Corporation), 
there are additional requirements that the I/O subsystem 
must remain consistent with existing support. Applications 
must continue to run unmodified, and error recovery and 
dynamic configuration must be preserved or even improved. 
Sharing of I/O resources must be enabled as well as the 
integrity of the data being sent or received. This presents 
new and complex challenges that need to be resolved. 

In order to achieve bandwidths which are dramatically
higher and still meet the other requirements described above,
a new system architecture is needed.

SUMMARY OF THE INVENTION 

A queuing method and apparatus for receipt and transfer
of incoming and outgoing data in a network environment
having a main storage. The mechanism includes at least one
set of dedicated input queues and at least another set of
dedicated output queues. In addition, a plurality of queuing
components is also provided that include attributes of
devices to and from which data is to be transferred or
received, and information about the queuing mechanism
itself. The input and output queues also comprise an
information block containing the addresses of all input and
output queues, a storage information block providing
information about the queuing mechanism, and storage list
information blocks that are defined for each queue, containing
specific information about that queue itself. In addition, the
input and output queue sets include storage lists for
identifying any input-output buffer(s) associated with each
queue set and a storage block address list for providing
information about storage locations of any input-output buffer.
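
As a rough illustration of how the components named in this summary relate to one another, the following sketch models them as plain Python data classes. The class names, field names, the 4096-byte block size and the helper functions are assumptions made for readability and are not the formats defined by the invention; addresses are modeled as plain integers, and the figure of 128 buffers per queue is the one given for the preferred embodiment in the detailed description.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BUFFERS_PER_QUEUE = 128   # number of buffers per queue in the preferred embodiment
BLOCK_SIZE = 4096         # hypothetical storage block size, for illustration only


@dataclass
class SBAL:
    """Storage block address list: one address per storage block of a buffer."""
    block_addresses: List[int] = field(default_factory=list)


@dataclass
class StorageList:
    """SL: one entry per buffer, each holding the address of that buffer's SBAL."""
    sbal_addresses: List[int] = field(default_factory=list)


@dataclass
class SLIB:
    """Storage list information block: information about one queue."""
    queue_number: int
    storage_list: StorageList


@dataclass
class QIB:
    """Queue information block: locates all input and output queues."""
    input_queues: List[SLIB] = field(default_factory=list)
    output_queues: List[SLIB] = field(default_factory=list)


def build_queue(queue_number: int, sbal_store: Dict[int, SBAL], base: int,
                blocks_per_buffer: int = 2) -> SLIB:
    """Create one queue; `sbal_store` stands in for main storage (an assumption)."""
    sl = StorageList()
    for i in range(BUFFERS_PER_QUEUE):
        sbal_address = base + i  # any unique key works in this model
        first_block = sbal_address * blocks_per_buffer * BLOCK_SIZE
        sbal_store[sbal_address] = SBAL(
            [first_block + j * BLOCK_SIZE for j in range(blocks_per_buffer)]
        )
        sl.sbal_addresses.append(sbal_address)
    return SLIB(queue_number, sl)


def buffer_blocks(slib: SLIB, sbal_store: Dict[int, SBAL], buffer_index: int) -> List[int]:
    """Resolve a buffer index to the storage block addresses that make it up."""
    sbal_address = slib.storage_list.sbal_addresses[buffer_index]
    return sbal_store[sbal_address].block_addresses


store: Dict[int, SBAL] = {}
q_in = build_queue(0, store, base=0)
q_out = build_queue(0, store, base=BUFFERS_PER_QUEUE)
qib = QIB(input_queues=[q_in], output_queues=[q_out])
```

The point of the two levels of indirection (storage list to SBAL, SBAL to storage blocks) is that a buffer need not be contiguous: `buffer_blocks` simply follows the chain from a buffer index to its blocks.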

BRIEF DESCRIPTION OF THE DRAWINGS 

The subject matter which is regarded as the invention is
particularly pointed out and distinctly claimed in the
concluding portion of the specification. The invention, however,
both as to organization and method of practice, together with
further objects and advantages thereof, may best be understood
by reference to the following description taken in
connection with the accompanying drawings in which:

FIG. 1 is an illustration of a network computing environ- 
ment utilizing a channel subsystem and a control unit; 

FIG. 2 is an illustration of a network computing environment
as per one embodiment of the present invention; FIG. 2A
shows the use of some channel and control unit functions
while FIG. 2B shows the details of the Interface element;

FIG. 3 is an illustration of a queuing mechanism as per
one embodiment of the present invention;

FIG. 4 illustrates SETUP SDU fields;

FIG. 5 represents the format for the command request 
block for store-subchannel-QDIO data; 

FIG. 6 represents the format for the command response
block for the store-subchannel-QDIO data command;

FIG. 7 is a tabular illustration of the contents of input 
queues as per one embodiment of the present invention; 

FIG. 8 is a tabular illustration of the contents of output 
queues as per one embodiment of the present invention; 
FIG. 9 is an example of a queue information block content
as per one embodiment of the present invention;

FIG. 10 is an example of a SLIB block content as per one
embodiment of the present invention;

FIG. 11 is an example of a SLIBE block content as per one
embodiment of the present invention;

FIG. 12 is an example of a Storage List content as per
one embodiment of the present invention;

FIG. 13 is an example of a SBALE content as per one
embodiment of the invention; and

FIG. 14 is an example of a Storage-List-State-Block 
content as per one embodiment of the present invention. 

DETAILED DESCRIPTION OF THE 
INVENTION 


An example of an existing data processing system architecture
is depicted in FIG. 1. As shown in FIG. 1, information
is passed between the main storage 110, and one or more
input/output devices (hereinafter I/O devices) 190, using
channel subsystems 150. Through the switch 160, channel
paths are established, comprising channels 155 and one or
more control units shown at 180. These channel paths are the
communication links established between the I/O devices
190 and the main storage for processing and exchange of
information.

The main storage 110 stores data and programs which are
input from I/O devices 190. Main storage is directly addressable
and provides for high speed processing of data by central
processing units and one or more I/O devices. One example of a
main storage is a customer's storage area and a system area
(not shown). I/O devices 190 receive information or store
information in main storage. Some examples of I/O devices
include card readers and punches, magnetic-tape units,
direct-access storage devices (DASD), displays, keyboards,
printers, teleprocessing devices, communication controllers
and sensor-based equipment.

The main storage is coupled to the Storage Control Element
(SCE) 120 which in turn is coupled to one or more central
processing units (CPU) 130. The central processing unit(s) is
the control center of the data processing system and typically
comprises sequencing and processing facilities for instruction
execution, initial program loading and other related functions.
The CPU is usually coupled to the SCE via a bidirectional or
uni-directional bus. The SCE, which controls the execution and
queuing of requests made by the CPU and channel subsystem, is
coupled to the main storage, CPUs and the channel subsystem via
different busses.

The channel subsystem directs the flow of information between
I/O devices and main storage and relieves the CPUs of the task
of communicating directly with the I/O devices so that data
processing operations directed by the CPU can proceed
concurrently with I/O processing operations. The channel
subsystem uses one or more channel paths as the communication
links in managing the flow of information to or from I/O
devices. Each channel path consists of one or more channels,
located within the channel subsystem, and one or more control
units. In one preferred embodiment, a [...]

As can be seen in FIG. 1, it is also possible to have one or
more dynamic switches or even a switching fabric (network of
switches) included as part of the path, coupled to the
channel(s) and the control unit(s). Each control unit is further
attached via a bus to one or more I/O device(s).

The subchannel is the means by which the channel subsystem
provides information about associated I/O devices to the central
processing units; the CPUs obtain this information by executing
I/O instructions. The subchannel consists of internal storage
that contains information in the form of a channel command word
(CCW) address, channel path identifier, device number, count,
status indications, and I/O interruption subclass code, as well
as information on path availability and functions pending or
being performed. I/O operations are initiated with devices by
executing I/O instructions that designate the subchannel
associated with the device.

The execution of input/output operations is accomplished by the
decoding and executing of CCWs by the channel subsystem and
input/output devices. A chain of CCWs (input/output operations)
is initiated when the channel transfers to the control unit the
command specified by the first channel command word. During the
execution of the specified chain of I/O operations, data and
further commands are transferred between the channel(s) and the
control unit(s).

As explained earlier, in order to achieve bandwidths which are
dramatically higher and move from 100 Mbits to Gbit
technologies, a combination of improvements is required.

FIG. 2 depicts the network environment of the present invention.
FIG. 2A depicts how the existing channel subsystem and control
units are replaced by an Interface element as shown at 200 along
the path 210. A Connector Interface Element and a Network
Interface Element are also components of the Interface element
as shown at 240 and 260 respectively. The present invention
still allows the use of most programming and code structure of
the existing architecture, but provides a much faster and more
efficient system by bypassing the need for addressing many of
the existing required functions such as the multitudes of
channel commands, by eliminating the need for many processing
steps.

The architecture of the present invention can be better depicted
in the configuration represented by FIG. 2B. The Connector
Interface Element shown at 240 can include a variety of
processors, at least one of which is used for redundancy [...]
interface cards. A direct [...] attached I/O device such as a
Self-Timed Interface bus, hereinafter STI bus (shown at 230), as
used in one embodiment of the present invention connects the
Connector Interface Element to [...] (referenced to as the
host), which in turn can be connected to a variety of [...] such
as web-servers and other TCP/IP oriented servers. The Connector
Interface Element is in processing communication with the
Network Interface Element shown at 260 via another [...]
attached I/O device such as a Peripheral Controller Interface
bus, hereinafter PCI bus as shown at 250, as used in one
embodiment of the present invention. The device adapters, at
least one or more processors and some local storage reside in
the Network Interface Element. Consequently, the Network
Interface Element is connected to individual application users
depicted at 270 such as Lotus Notes clients and Web browsers.

Data streams and requests for retrieval of data from servers by
application users are transferred via the Interface Element to
the main storage where a plurality of queues can be set up for
processing and storage of the data while providing the advantage
of bypassing any need for causing an interrupt in the main
program. The status of the network is then updated to reflect
the changes. Once the appropriate response or data is retrieved
from the servers, these multiple queues are interrogated
simultaneously to determine the appropriate application server
that the data needs to be sent to. Subsequently, data from the
servers is also transmitted via the Interface Element to the
application users in the same manner by establishing and
interrogating the queues.

The queuing mechanism needs to be explained in more detail. The
queuing mechanism of the present invention is referenced to as
the Queued Direct I/O (QDIO) facility and comprises
communication stacks. Input queues, output queues, or both may
be provided. When the QDIO input queues are provided, the
program can directly access data placed into the input queues by
the adapter(s) of the Interface Element. Typically, the source
of the data placed into such input queues originates from an I/O
device or network of devices to which the adapter is connected.

Correspondingly, when the QDIO output queues are provided, the
program can transmit data directly to the adapter by placing
data into the appropriate output queues. Depending on the
adapter, the data placed into such output queues may be used
internally by the adapter or may be transmitted to one or more
I/O devices to which the adapter is connected.

The built-in queue sets are located in the program storage and
are separate from the data control traffic. In a preferred
embodiment up to 240 queue sets are provided. A direct adapter
storage interface is also provided to minimize interrupts and
other overhead. Each queue set in the mechanism provides for
separate outbound and inbound queues; in one preferred
embodiment, four outbound and at least one
inbound queue. Each application is assigned to at least one
queue set which comprises a number of input or output queues,
and each queue set can share one or more adapters. The queue
sets provide for a list of useable buffers and also a list of
storage blocks for incoming/outgoing data. The buffers are
further prioritized to address specific application needs. At
initialization time and subsequently when desired or a change is
required, queues are initiated for each application(s). Queues
are naturally static at initialization time when they are
flexibly defined but as new applications are being assigned, the
queuing becomes dynamic and updates are made at intervals or
continuously, as desired, to reflect the latest nature of them.

For both QDIO input and output queues, main storage is used as
the medium by which data is exchanged between the program and
the adapter. Additionally, these queues provide the ability for
both the program and the adapter to directly communicate with
each other in an asynchronous manner which is both predictable
and efficient without requiring the services of a centralized
controlling mechanism, such as an Operating System Input/Output
Supervisor, and the resulting overhead such a control mechanism
implies. Both input and output queues are constructed in main
storage by the program and are initialized and activated at the
QDIO adapter, as described below. Each queue consists of
multiple separate data structures, called queue components,
which collectively describe the queue's characteristics and
provide the necessary controls to allow the exchange of data
between the program and the adapter.

A Queuing status block is established to reflect the changes
dynamically as per the changing I/O activity status. The queues
comprise buffers which reflect channel ownership in the channel
subsystem, and the ownership also gets updated as the picture
dynamically changes. The queue sets are connected via the
adapter to the host/main storage. In one preferred embodiment
where separate images are provided for virtual systems, each
virtual system can also be assigned a separate queue set in the
queuing mechanism.

Exchange of Data

The program and the QDIO adapter use a state change signaling
protocol in order to facilitate the exchange of data. This
protocol is applied to each input and output data buffer
associated with each of the active input and output queues. Both
input and output buffers are managed and exchanged between the
program and the adapter by placing the buffer into various
states which are maintained in a special location that is set
aside and is associated with each buffer. For example, for input
queues, asynchronous to the execution of the program, the QDIO
adapter places data received from the associated I/O device into
input buffers that are in the input buffer empty state. For each
input buffer that has data placed into it by the adapter, the
state of the buffer is changed from input buffer empty to input
buffer primed. The program then examines in sequence (such as
round robin) the state of all input buffers associated with all
QDIO input queues and processes the data in each input buffer
that is in the input buffer primed state. Upon completion of
input buffer processing, the program may change the state of the
buffer to input buffer empty in order to make the buffer
available for reuse by the adapter for subsequent input data
from the attached I/O device. When the program changes the state
of one or more input queue buffers from primed to empty, it
executes a SIGNAL ADAPTER instruction which designates the read
function in order to signal the adapter that one or more input
buffers are now available for use.

Similarly, for output queues, asynchronous to the execution of
the QDIO adapter, the program places output data into one or
more QDIO output queue buffers that are in the output buffer
empty state, output buffer not initialized state, or output
buffer error state and then changes the state of each such
buffer to the output buffer primed state. The program executes a
Signal Adapter instruction which designates the write function
in order to signal the adapter that one or more output queues
now have data to be transmitted to the I/O device attached to
the adapter. Asynchronous to the execution of the program, the
QDIO adapter transmits the data in each QDIO output buffer that
is in the output buffer primed state to the attached I/O device.
Upon completion of transmission, the adapter changes the state
of each such buffer to the output buffer empty state in order to
make the buffer available for reuse by the program.

Additionally, each data buffer also has an ownership state which
identifies either the program or the adapter as the controlling
element of the buffer for the period of time that element is
responsible for managing and processing the buffer.
Additionally, the queuing mechanism provides for a
prioritization scheme for the queues. Device addresses are used
as queue anchors, retaining I/O heritage to reduce cost.

Queue Components

FIG. 3 depicts the control structure overview for the input and
output queues associated with a QDIO subchannel. FIG. 3 also
demonstrates the queue components as defined for the present
invention. The Queue Information Block (QIB) contains
information about the collection of QDIO input and output queues
associated with a given subchannel. It provides information for
collection of input and output queues for the adapter associated
with the subchannel. One QIB is defined per QDIO subchannel;
FIG. 9 provides the format of the queue-information block as per
one embodiment of the present invention.

The Storage List Information Block (SLIB) provides for the
address of information stored pertaining to each queue. One SLIB
is defined for each queue. SLIB contains information about a
QDIO queue and has a header and entries called
storage-list-information-block entries containing information
about each of the buffers for each queue. FIG. 10 provides SLIB
format as per one embodiment of the present invention.
Furthermore, a storage list information block element or SLIBE
can be provided containing information regarding the QDIO data
buffer as determined by the corresponding SL entry. FIG. 11
depicts a sample SLIBE content.

The Storage list or SL defines the SBAL or storage block address
lists that are defined for each I/O buffer associated with each
queue. One SL is defined for each queue which contains an entry
for each QDIO I/O buffer associated with the queue. SL provides
information about the I/O buffer locations in main storage. As
per one embodiment of the present invention, FIG. 12 provides a
sample SL content. SL also provides the absolute storage address
of a storage block address list. In turn, SBAL contains a list
of absolute addresses of the storage blocks that collectively
make up one of the data buffers associated with each queue. A
storage block address list entry or SBALE is also provided as
part of each SBAL. Each SBALE contains the absolute storage
address of a storage block. Collectively, the storage blocks
addressed by all of the entries of a single SBAL constitute one
of the many possible QDIO buffers of a QDIO queue. In a
preferred embodiment, the number of these possible QDIO buffers
equals 128. FIG. 13 provides the format of a SBALE as provided
by one embodiment of the present invention. SBALF or SBAL Flags
contain information about the overall buffer associated with the
SBAL containing each SBALE, and not just about the storage block
associated with
each SBALE. The description of contents of the SBALF field is
different for each SBALE within the SBAL.

A Storage-List-State Block or SLSB contains state indicators
that provide state information about the QDIO buffers that make
up a queue. A QDIO buffer consists of the collection of storage
blocks that can be located using all of the addresses in a
single storage-block-address list. Depending on the current
state value in an SLSB entry, either the program or the QDIO
control unit can change the state of the corresponding QDIO
buffer by storing a new value in the entry. FIG. 14 provides a
sample SLSB format as per one embodiment of the present
invention. SLSB also provides for a SQBN or state of queues
buffer N which provides the current state of the corresponding
QDIO buffer. The QDIO buffer that corresponds to a given SLSB
entry is determined by the storage list entry having the same
sequential position in the storage list as the SQBN field has in
the SLSB. In one embodiment, the state value consists of two
parts: bits 0-2 indicate whether the buffer is owned by the
program or the QDIO control unit and whether the buffer is an
input or output buffer. Bits 3-7 contain a value that indicates
the current processing state of the buffer. In this embodiment
different bits can also be identified to mean different
configurations. For example, bit zero can be established to
indicate program ownership, while bits 1 and 2 provide for QDIO
control unit ownership and buffer type respectively. Bits 3-7
can contain a binary value that indicates the current processing
state of the associated buffer such as empty (available for data
storage), primed (available to be processed), not initialized
(not available for use), halted (contains valid data but data
transfer was prematurely halted by program executing Halt
Subchannel), and error (associated buffer is in an error state
and contents of buffer are not meaningful).

Storage Blocks or SBs are storage blocks that are defined
collectively to define a single I/O buffer.

The overall process by which QDIO queues are used to exchange
data between the program and a QDIO adapter is as follows:

1) The program constructs one or more input queues and/or output
queues in main storage. The maximum number of such queues that a
QDIO recognizes depends on the type and model of the adapter.
These limits can be obtained by use of a CHSC or
Store_Subchannel_QDIO_data command.

2) The program transmits the main storage location of each input
or output queue to the QDIO adapter by use of an
establish_QDIO_queues channel command. To accomplish this, a
Start Subchannel instruction is also executed which designates a
QDIO subchannel that is associated with the QDIO adapter.

3) Upon successful completion of the establish_QDIO_queues
command, the program then activates the queues at the QDIO
adapter by executing an activate_QDIO_queues channel command.
Upon its successful completion, the subchannel is placed into
the subchannel-active state and the QDIO-active state. Again a
Start Subchannel is used to accomplish this. Alternatively, the
activate_QDIO_queues command may be command chained to a
previous establish_QDIO_queues command when Start Subchannel is
executed in the previous step.

4) Upon activation of the queues, both the program and the
adapter can asynchronously transmit data to each other by
appropriate use of the queues as long as the designated
subchannel, with which the queues are associated, remains in a
subchannel-active and QDIO-active state.

5) Any action that causes a QDIO subchannel to exit the
subchannel-active and QDIO-active states causes the QDIO adapter
to stop examining and processing all queues associated with the
subchannel. This includes: a program initiated action such as
clear or halt subchannel that designates a QDIO subchannel, an
error condition (including errors within the QDIO adapter, the
channel subsystem or elsewhere in the central processing complex
that affects the state of a QDIO subchannel) that causes a
QDIO-active subchannel to enter a status pending with
alert-status state, or a reset/reconfiguration action initiated
by the program or operator [...] the [...] of the QDIO adapter
to process QDIO subchannels or their queues, such as a
reconfigure_channel_path that deconfigures the only available
QDIO channel path to which a QDIO subchannel is associated.

The design of the present invention provides the ability to
share access to this device across multiple communication
stacks, multiple priorities and multiple virtual guests and/or
multiple logical partitions. A new mechanism for mapping various
resources to queues which are serviced by the microcode is
devised to facilitate resource allocation and dynamic
configuration, including single point of definition. This new
mechanism includes a new control path interface to facilitate
initialization of the configuration parameters and the queue
structure(s). This includes dynamic expanding of the number of
queues and queue elements as traffic patterns and feedback
indicate. The organization of control blocks is critical to
minimize the amount of data which needs to be translated across
the various software layers, given virtual addressing
constraints relative to page fixings as required by the I/O.

As the data comes in through the adapter, a buffer is assigned
to it and in this way, cache pollution is avoided. The channel
subsystem in this configuration still operates in the
traditional mode for the control flow but in the new manner
explained above for data flow, providing interrupt-free outbound
traffic. The inbound traffic has to allow for interrupts. For
the inbound traffic, it is not always obvious when exactly the
data arrives, and the mechanism allows for selective use of
interrupts. In one embodiment there is even an adaptive rate
established between the interrupts and the polling rate. Hence,
inbound interrupts only take place during low data rates.

Queue Priority and Sequencing

Both input and output queues are processed by a QDIO adapter in
priority sequence as follows:

1) The lowest numbered queue has the highest priority and the
highest numbered queue has the lowest priority.

2) For output queues, the adapter processes primed state buffers
for the highest priority output queue before processing buffers
associated with the next highest priority output queue.

3) For input queues, adapter processing is dependent on the type
of QDIO channel path to which it is configured. For adapters
configured to OSADE channel paths, the adapter processes
incoming data according to the inherent priority of the data,
placing the data into empty state buffers of the queue with the
associated priority.

4) Depending on the type of QDIO adapter and the model, input
queues may have priority over output queues, vice versa, or no
defined priority may exist between the two.

5) For both input and output queues, each queue is processed in
a sequential round robin manner starting with the buffer
associated with SBAL 0, called buffer 0, and continuing until
the buffer associated with the last SBAL, or buffer, is
processed, at which point processing starts again with buffer 0.

For input queues, each buffer in the input buffer empty state is
sequentially processed until the adapter encounters a
buffer that is not in the empty state or no more input data is
received. The adapter then processes the non-empty state buffer
by looking at whether the input buffer primed, input buffer not
initialized, or input buffer error state is detected. When it
sees a buffer in any of these states, the process of scanning
the remaining queue entries is suspended until either an
interval of time has elapsed, a SIGNAL ADAPTER read function is
executed, or additional input from the device or network of
devices is detected. This process is continued until the buffer
reaches an input buffer empty state, at which time it is
processed and the adapter resumes the sequential processing of
the remaining queue entries. If the input buffer is in any other
state, the adapter terminates the processing of all queues for
the associated QDIO subchannel.

cost of I/O. If one could have a zero impact I/O structure, a
ULP would be free to optimize for its environment rather than
conform to rules determined by an I/O structure.

In the present invention a new controller area is defined.
During the initialization time, a numeric value is passed to ULP
ENABLE which specifies the amount of buffer space needed to
build a header required by the adapter, preferably a GigaEnet
adapter. A connection manager will then pass this value to all
ULPs that wish to utilize the adapter, and during data flows,
all datagrams sent will have that amount of storage between the
header and the datagram. This methodology removes the need for
locating storage in the data path for adapter header placement,
which in turn improves the overall system throughput. In
addition the present invention provides for the sharing of
network attachment with each ULP owning its own device address.
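
The input-queue scanning rule described above (fill empty buffers in order, suspend on a primed, uninitialized or error buffer, terminate on any other state) can be sketched as follows. This is a minimal model under stated assumptions, not the adapter's actual logic: the enum names, the `scan_input_queue` function, the use of a Python list for the state block and a deque for arriving data are all hypothetical, and `HALTED` is used only as a stand-in for "any other state".

```python
from collections import deque
from enum import Enum
from typing import Deque, List, Tuple


class State(Enum):
    EMPTY = "input buffer empty"
    PRIMED = "input buffer primed"
    NOT_INITIALIZED = "input buffer not initialized"
    ERROR = "input buffer error"
    HALTED = "halted"  # stand-in for "any other state" (assumption)


class Outcome(Enum):
    NO_MORE_DATA = "no more input data"
    SUSPENDED = "scan suspended until timer, SIGA read, or new input"
    TERMINATED = "processing of all queues terminated"


# States on which the text says scanning is suspended rather than terminated.
SUSPEND_STATES = {State.PRIMED, State.NOT_INITIALIZED, State.ERROR}


def scan_input_queue(states: List[State], buffers: List[bytes], start: int,
                     incoming: Deque[bytes]) -> Tuple[int, Outcome]:
    """Place incoming data into empty buffers, round robin, starting at `start`.

    Returns the index at which scanning stopped and the reason.
    """
    i = start
    while incoming:
        state = states[i]
        if state is State.EMPTY:
            buffers[i] = incoming.popleft()   # adapter stores the data
            states[i] = State.PRIMED          # empty -> primed
            i = (i + 1) % len(states)         # wrap back to buffer 0
        elif state in SUSPEND_STATES:
            return i, Outcome.SUSPENDED
        else:
            return i, Outcome.TERMINATED
    return i, Outcome.NO_MORE_DATA


states = [State.EMPTY, State.EMPTY, State.PRIMED, State.EMPTY]
buffers = [b"", b"", b"old", b""]
incoming = deque([b"a", b"b", b"c"])
first = scan_input_queue(states, buffers, 0, incoming)

# The program drains buffer 2 and marks it empty; the adapter then resumes.
states[2] = State.EMPTY
second = scan_input_queue(states, buffers, first[0], incoming)
```

In this model the second call corresponds to the adapter re-examining the same buffer after a SIGNAL ADAPTER read function or timer event; returning the stop index lets the caller resume exactly where scanning was suspended.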

For output queues, each output buffer primed state buffer is
sequentially processed until the adapter encounters a buffer
that is not in the primed state or until a model dependent
"fairness" algorithm causes the adapter to process the next
lower priority output queue. When an output buffer that is not
in the output buffer primed state is detected, the adapter
processes the non-primed state buffer as follows. When the
output buffer is empty, is not initialized or is in an error
state, the adapter suspends the process of scanning until an
interval has passed or a SIGNAL ADAPTER write function is
executed. Depending on the model, when one or more of these
events occur, the adapter again accesses the SLSB entry for the
same I/O buffer that was previously detected as being in one of
these states. If the buffer is still in one of these states, the
adapter again suspends processing of that queue. If the buffer
is now in the output buffer primed state, the buffer is
processed and the adapter resumes the sequential processing of
the remaining queue entries. If the output buffer is in any
other state, the adapter terminates the processing of all queues
for the associated QDIO subchannel.

Important Instructions

The present invention provides for several novel instructions
and commands that do not exist in the present technology. The
first of these is called a Signal Adapter instruction,
hereinafter SIGA instruction. The SIGA instruction comes in
several flavors such as a read, a write, and a synchronize SIGA.
The command is primarily established to give operational
initiative that is missing from the existing systems. The SIGA
instruction works almost like a wake-up call, reminding the
system to go and check its queues and process what is pending.
It functions as a mid-I/O intrusion function that is designated
for the checking of the queues. It is an I/O operational signal
structure which, in case of its synchronization flavor,
synchronizes the data in the queues [...] out and the queues are
[...]. It can be started by a [...] timer if desired.

In a preferred embodiment of the present invention, the SIGA
comprises an eight bit function code and, if called for, a 32
bit parameter is transmitted to the adapter. The following is an
example of a SIGA structure:

The above configuration provides for interlock data i.siga 

movement avoidance between the queue mechanism where d 2 b 3 [S] 

the application can place network data on a queue which can + - - - — - + t t 

be accessed too easily. The initiative and/or control is passed 40 | ' B274 ' I Bl I I 

for the queues between the server software and the micro- 0 16 20 31 

code as to avoid unnecessary interrupts where ownership of 

queues is passed back and forth and unnecessary data . 

movements where ownership of data is transferred back and General 0 contains the function code which 

forth under guaranteed interlock to eliminate out of order 45 g*cifiw the operation to be performed by the adapter, 

updates. All updates of both the shared states and queues General register 1 contains the subsystem-identifica ion 

must be in absolute synchronization. There is also a shared W0 ^ W ^ h T ^ e Tf t u ft ™&™Uon 

. * * „„ T „ , . , . , , and the QDIO adapter that is to be signaled. Depending on 

state interface control or SSIC mechanism used to control ^ ^ fas J m , 2 * ^ 

logical ownership of I/O buffers. bit parameter. The definition and purpose of this parameter 
Coupled with these initiatives is a new mechanism for so d ^ Qn ^ blctjoQ code when th e ^ 
software to interrogate status updates as described below. spec i nes either (1) initiate-output queues, or (2) initiate- 
Previously, this was provided exclusively via interruption. In input queueS) general register 2 specifies which input or 
this way the present invention enables interrogation across output queues are to be processed by the adapter, 
queues (multiple priorities) under control of a timer and, as Function Code O/Initiate Output — When the function 
described earlier, in periods of low activity, interrupts are 55 code specifies initiate-output, the associated QDIO adapter 
provided and then when activity reaches a certain threshold, is signaled to a synchronously process one or more output 
control is switched to use the timer. queues associated with the specified subchannel. In this 
The interface must be designed to establish a cooperative case, the instruction is referred to as SIGA-w (SIGNAL 
environment with the Upper Layer Protocols or the ULPs ADAPTER — write). The output queues that are to be pro- 
such that the cost to the ULP of executing I/O is minimized. 60 cessed are specified in general register 2. 
Cost reduction techniques for both small and large data Function code 1/Initiate Input — When the function code 
packets must be designed into the interface. Besides the specifies initiate-input, the associated QDIO adapter is sig- 
obvious costs of I/O in terms of instructions per operation, naled to a synchronously process one or more input queues 
there exists a set of other costs related to but not directly associated with the specified sub -channel. In this case, the 
measured against the cost of the current structures. These 65 instruction is referred to SIGA-r or Signal Adapter read. The 
may be generally described as the price I/O users pay in their input queues that are to be processed are specified in general 
own code base to either avoid or minimize the measurable register 2. 
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Function code 2/Synchronization — When the function 
code specifies synchronize, the virtual machine is signaled 
to update the data queues SLSB and SBAL entries in order 
to render them current as observed by both the program and 
the QDIO adapter. In this case, the instruction is referred to S 
as SIGA-s or Signal Adapter synchronize. 

SIGA-s is required in virtual machine models where QDIO data queue
sharing between the program and the adapter is simulated by the use of
separate unshared copies of the queues' SLSB and SBAL components. One
copy of these components is used by the program and one copy is used by
the adapter. The execution of SIGA-s signals the virtual machine to
update these unshared copies for the data queues as necessary so that
both the program and the QDIO adapter observe the same contents for
these queue components.

When SIGA-s is specified: 

1) The output queues for the designated subchannel that are to be
synchronized are specified in general register 2.

2) All input queues for the designated subchannel are synchronized.

3) The QDIO adapter is not signaled.

4) The virtual machine is signaled if the program is executing in a
virtual machine environment. No virtual machine signal is generated
when the program is not executing in a virtual machine.

For the SIGA-w, SIGA-r and SIGA-s functions, the
second operand (B2, D2) is ignored.
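The instruction image and function codes described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the field split suggested by the format diagram (a 16-bit operation code 'B274' in bits 0-15, a 4-bit B2 field in bits 16-19, and a 12-bit D2 field in bits 20-31), and the helper name `encode_siga` is hypothetical.

```python
SIGA_OPCODE = 0xB274  # operation code shown in the instruction format

# Function codes carried in general register 0, as listed in the text.
FUNCTION_CODES = {0: "SIGA-w", 1: "SIGA-r", 2: "SIGA-s"}


def encode_siga(b2: int = 0, d2: int = 0) -> int:
    """Pack the 32-bit instruction image: opcode | B2 | D2 (assumed layout)."""
    if not 0 <= b2 <= 0xF:
        raise ValueError("B2 is a 4-bit field")
    if not 0 <= d2 <= 0xFFF:
        raise ValueError("D2 is a 12-bit field")
    return (SIGA_OPCODE << 16) | (b2 << 12) | d2
```

Because the second operand is ignored for all three functions, a caller would typically leave both fields at zero and pass the function code, subsystem-identification word, and queue mask in general registers 0, 1 and 2.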

When the SIGA-r, SIGA-w or SIGA-s functions are specified, general
register 2 specifies a 32 bit parameter that designates which input or
output queues are to be processed by the adapter. Bits 0 through 31
correspond one for one with input or output queues 0 through 31
respectively and are called queue indicators (QI). Additionally, both
input and output queues are prioritized by queue number, with the
lowest numbered queue (queue 0) having the highest priority and the
highest numbered queue (queue 31) having the lowest priority.

When a queue indicator is one and the corresponding queue is valid, the
QDIO adapter is signaled to process the corresponding input or output
queue. When a queue indicator is one and the corresponding input or
output queue is invalid, the queue indicator is ignored.

A queue is valid when it is established and is active. A queue is
invalid when it is not established, is not active, or the model does
not allow a queue to be established for the corresponding queue
indicator.

When the queue indicator is zero, no action is required to be taken at
the adapter for the corresponding queue. When all queue indicators in
general register 2 are zero, the
adapter is not signaled and no other operation is performed.
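The queue-indicator rules above can be illustrated with a short sketch. It assumes that bit 0 is the leftmost (most significant) bit of the 32-bit value, which the text does not state explicitly; the function names are hypothetical.

```python
def queue_indicator_mask(queues):
    """Build the 32-bit general register 2 value, one indicator per queue.

    Assumption: bit 0 is the most significant bit of the word.
    """
    mask = 0
    for q in queues:
        if not 0 <= q <= 31:
            raise ValueError("queue numbers run from 0 to 31")
        mask |= 1 << (31 - q)
    return mask


def queues_to_process(mask, valid_queues):
    """Queues the adapter acts on: indicator is one and the queue is valid.

    Returned in priority order (queue 0 first). Indicators for invalid
    queues are ignored; an all-zero mask yields an empty list.
    """
    return [q for q in range(32)
            if mask & (1 << (31 - q)) and q in valid_queues]
```

For example, a program that wants queues 0 and 31 processed would pass `queue_indicator_mask([0, 31])`; if only queue 0 is established and active, the adapter acts on queue 0 alone.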

Subsequent to the execution of SIGA, the QDIO adapter associated with
the designated subchannel performs the specified function. When the
SIGA-w function is specified, the adapter processes each specified
output queue in priority sequence. For each queue that contains queue
buffers in the primed state, the data in the buffers is transmitted
and, upon completion of transmission, the queue buffers are placed into
the empty state. This process continues until the data in all primed
output queue buffers, for all specified output queues, has been
transmitted.

When the SIGA-r function is specified, the adapter processes each
specified input queue in priority sequence. For each queue that
contains queue buffers in the input buffer empty state, data is placed
into the queue buffers as it is received and, upon completion of the
transmission, the queue buffers are placed into the input buffer primed
state. This process continues for each empty queue buffer in sequence
until a buffer that is not in the input buffer empty state is reached.
This process is then repeated for the next lower priority input queue,
until all empty queue buffers for all specified
input queues have been filled with data.
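The adapter behavior described for SIGA-w and SIGA-r can be sketched as below. This is a simplified model under stated assumptions: each queue is a list of two-element `[state, data]` buffers, only the empty and primed states are modeled, and the function names and data layout are hypothetical.

```python
EMPTY, PRIMED = "empty", "primed"


def siga_w(queues, selected):
    """Transmit every primed output buffer of the selected queues.

    Queues are visited in priority order (lowest number first); each
    transmitted buffer is returned to the empty state.
    """
    sent = []
    for q in sorted(selected):
        for buf in queues[q]:
            if buf[0] == PRIMED:
                sent.append(buf[1])
                buf[0], buf[1] = EMPTY, None
    return sent


def siga_r(queues, selected, incoming):
    """Fill empty input buffers in sequence and mark them primed.

    Filling of a queue stops at the first buffer that is not empty, then
    the next lower priority queue is tried. Returns undelivered data.
    """
    incoming = list(incoming)
    for q in sorted(selected):
        for buf in queues[q]:
            if buf[0] != EMPTY or not incoming:
                break
            buf[1] = incoming.pop(0)
            buf[0] = PRIMED
    return incoming
```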
Shared State Interface Control 

Another important aspect of the present invention is its shared state
interface. The Shared State Interface Control or SSIC function, which
provides a shared state interface between the QDIO adapter and a QDIO
program, such as a multipath channel program, can best be described in
the following diagram:



WRITE

QDIO Program                     State          QDIO Adapter

Fill 'n' SBALs with data,
set state to                 ->  primed
(multiple SBALs may be
processed)
Issue SIGA to drive the
adapter
                                                Process all outbound data,
                                 empty     <-   set state to
Program frees 'empty'
write buffers after SIGA
'Last ditch' timer will free
any lingering buffers

READ

QDIO Program                     State          QDIO Adapter

If required, replace used
buffers for multiple SBALEs
within each SBAL,
set state to                 ->  empty
                                                Fill inbound buffers;
                                                for each SBAL used
                                 primed    <-   set state to
                                                Low traffic: new PCI,
                                                else nothing
Drain data and pass to ULP,
replace all used buffers,
set state to                 ->  empty



II. Store Subchannel QDIO Data or CHSC Command

Input/output operations for QDIO involve the use of an 
I/O device represented by a subchannel in the channel 
subsystem. The proper execution of QDIO I/O operations 
depends on certain characteristics of the subchannel. 
Examples of such characteristics are: 

whether the subchannel supports QDIO operations 

the format of the queues 

the number of input and output queues 

I/O-device requirements regarding program issuing of the 
SIGA instruction. 

The store-subchannel-QDIO-data command provides the program with a way
to determine from the channel subsystem the QDIO characteristics
(listed above) that the program must take into account in order to
perform I/O operations using a specified subchannel. Previous
mechanisms that allow programs to determine operational characteristics
of I/O devices normally consist of the program executing a channel
program to obtain such information from the I/O device.

By providing the store-subchannel-QDIO-data command, it is possible for
I/O devices to have different QDIO characteristics and for the program
to determine what those characteristics are prior to communicating with
the I/O device itself.

The CHSC command is used to obtain self description information for the
QDIO adapters associated with a specified range of subchannels. When
the CPC is operating in a mode where several images are used, the CHSC
command is used to obtain self description information for the QDIO
adapters associated with a specified range of subchannel images
configured to the logical partition that executed the command.
Information for subchannel images configured to other logical
partitions, if any, is not provided. FIG. 5 represents the format for
the command request block for the store-subchannel-QDIO-data command.
FIG. 6 represents the format for the command response block for the
store-subchannel-QDIO-data command. In addition, FIG. 6 includes the
Subchannel-QDIO Description Block.

In short, the CHSC command specifies which device the request for
processing can be sent to. It further provides the format and
attributes of the QDIO, such as the size and attributes of the queues,
and other characteristics that may relate to the specific processor.
The QFMT (QDIO Queues Format) and QDIO AC (QDIO Adapter
Characteristics) fields in the Subchannel-QDIO Description Block of
FIG. 6 include this information. IQCNT of the Subchannel-QDIO
Description Block provides the Input Queue Count, and OQCNT, also of
the Subchannel-QDIO Description Block, provides the Output Queue Count.

III. QDIO Priority Instructions

The user can issue a request leading to a SETUP_REQ instruction. When
processing this instruction, a device address will be assigned to the
user, which will be passed along via a SETUP SDU instruction. The SETUP
primitive will also pass priority queue information to the adapter. The
format of this is shown in FIG. 4. Length is defined as the length of
the DIF including this field. Category is defined as the value of
primitive specific. Type denotes the value of data path device address.
DEV_CUA is a multi-digit CUA in packed format. DEV_NO refers to the
device number assigned to this ULP's connection. Priority Service Order
is the order by which the adapter will service the queues. It is used
to provide favorable service for higher priority vs. lower priority
queues. Maximum Service Limit Units refers to the units that are used
under a favored treatment based on the amount of outbound data allowed
to be processed during one processing interval. It can be defined in
three flavors: maximum number of packets to be transmitted, counted
without regard to packet size; maximum number of bytes allowed to be
transmitted; and maximum number of SBALs that may be transmitted,
without regard to the number of packets or amount of data within the
SBAL. Maximum Service Unit Priority provides the number of units on a
priority basis.
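The three flavors of Maximum Service Limit Units can be sketched as a single accounting routine. This is an illustration under assumptions not stated in the text: an SBAL is modeled as a list of packets (byte strings), the adapter transmits whole SBALs only, and the limit is never exceeded within one interval. The names `service_queue`, `unit` and `limit` are hypothetical.

```python
def service_queue(sbals, unit, limit):
    """Return how many SBALs may be transmitted in one processing interval.

    unit selects how the limit is counted:
      "packets": number of packets, regardless of packet size
      "bytes":   number of bytes
      "sbals":   number of SBALs, regardless of their contents
    """
    used = 0
    transmitted = 0
    for sbal in sbals:
        if unit == "packets":
            cost = len(sbal)
        elif unit == "bytes":
            cost = sum(len(packet) for packet in sbal)
        elif unit == "sbals":
            cost = 1
        else:
            raise ValueError(f"unknown unit: {unit}")
        if used + cost > limit:
            break  # leave the rest pending for the next interval
        used += cost
        transmitted += 1
    return transmitted
```

Work left over when the limit is reached stays on the queue, which is what allows a lower priority queue to be serviced before the higher priority queue is revisited.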
Data Packing 

Data packing is another important feature that is affected by the
present invention. As the cost of I/O decreases, the need to prorate
traffic to reduce the cost per data element decreases. However, the
need still exists and the present design will allow for a multi-path
channel or MPC to perform data packing through the device driver code,
which "unpacks" packed data received from the ULPs directly into a
Storage_Block_Address_List array so that packed format data is not
handled directly. This approach is taken because packed data resides in
slower memory than the Storage_Block_Address_List array provides. In
addition, data packing for small objects is supported and
non-contiguous headers for large objects are supported within a single
data queue. In this context a non-contiguous header implies the use of
a single entry for network or control headers. A preferred ULP to be
supported is TCP/IP, which will build upon existing packing algorithms
to reduce the cost of I/O by continuing to pro-rate the cost across
multiple datagrams. When an MPC is used, the device driver code will
unpack the datagrams into the Storage_Block_Address_List arrays. To
provide for the efficient flow of large data objects, unpacked
datagrams will also be supported, but whether a given flow is to be
packed or not depends upon the size of the packet. To further optimize
the system when TCP/IP is used, TCP/IP will include a controller work
area, preferably a 32 byte header, and the start of the datagram for
all data transfers. In all cases the controller area, if specified,
must be provided by the ULP as part of any network or control header.
This includes single datagram transfers where network headers, any
control header, any defined data header and the user data have been
moved to form a continuous bit stream. Headers must also be supplied
when non-continuous header datagrams are used. MPC will not insert the
header on behalf of the ULP. Note that an SBALE or
Storage_Block_Address_List_Element is also defined, preferably with a
4k page limit, to allow attachment of the Queued Direct I/O to
different switches such as fiber optic switches and International
Business Machines' ESCON switch (ESCON is a registered trademark of IBM
Corp. of Armonk).

Another problem that severely impacts current systems is the lack of an
efficient gather/scatter function. Since data chaining is exposed to
the remote partner, it is no longer efficient for network
communications. Yet data movements within the server continue to be
major performance inhibitors for mid-size or large data objects. This
problem is resolved by introducing one or more out-of-band headers such
that the user data need not be moved or copied in construction of the
data stream.

The problems with system dispatching are also minimized by establishing
a common user interface such that the user can assist in dispatch
control. When an MPC is used, the MPC will establish a Direct Queue
Area or DQA for each ULP exploiting the network attachment. This area
will be used to control the queuing of inbound data as well as provide
the control structure to be used for dispatching options and
processing.

The present invention has enhanced the existing system support for high
performance applications that wish to take advantage of high speed
media attach. The intent is to minimize inbound dispatching by
providing a set of optional mechanisms that bypass the traditional SRB
dispatch from disabled code that occurs during current I/O disabled
completion. Since there is no change of ownership required for
protocols such as TCP/IP, the recovery procedure will no longer be
needed in many instances. Also, no assigned buffers (ASSIGN BUFFER) are
required for inbound traffic (TCP/IP). The data will not be blocked by
the MPC or multipath channel, and the interface layer will perform the
deblocking function itself. Since MPC is not deblocking into smaller
datagrams, there is no need for an assign buffer. The operation is
driven by a disable timer during mid-high traffic rates, all inbound
queues for all interfaces will be processed via the timer mechanism,
and fast interrupt indicators will be set off for all read data paths.
This in turn will eliminate the need for some inbound dispatching
functions like the use of the MPC supplied Direct Queue Area. The ULP
will include a user area for specific processing and the SBAL format
will include the addresses and lengths of input data. A new function,
IUTIL CM_ACT, is also provided that will contain fast dispatching
(FAST DISPATCH), which in turn will allow the ULP to optimize its own
environment.
Dynamic Configuration

In the existing systems, all Gateway-types of attachments need to have
a configuration file defined which identifies various items. These
items include the following:

1) Host Device Address: this definition is needed to define the Host
Number and Host Unit address, especially when multiple or virtual
images/machines are being used when passing data across any channel
interface. This information is needed by the channel subsystem to
determine which Host connection is to receive the incoming data. It is
also needed for each Host or Host Unit Address which is to be used to
transfer data across the channel interface to an adapter.

2) Host Application: this identifies which Host Application is using
the Host Device Address.

3) Application Specific Address: this address is used to identify the
specific Application Server to which the inbound data received from the
LAN is to be routed. Each Application Specific Address is directly
related to the Host Device Address and Host Application.

4) LAN Port Number: this identifies which LAN Port is to be used for
sending data which is received at the Gateway from the Host Device
Address.

5) Default Routes: these are defined on a Host Application basis. Each
Host Application can have a default Host Device Address specified. This
Host Device Address is used to send all traffic received from the LAN
for a specific Host Application for which an Application Specific
Address has not been defined. For example, if a TCP/IP packet is
received from the LAN and the TCP/IP address found in the packet was
not defined in the configuration file, this packet would be sent to the
Host over the Host Device Address defined by the Default Route entry.

6) Setting Thresholds for Priority Traffic: this defines the
percentages of processing which should be used on the various priority
traffic. For example, this command could be used to define the maximum
number of bytes which should be processed for a specific priority
before moving on to check for work for a different priority.

The present invention changes all that. All configuration information
defined above is no longer needed in the configuration file. In fact,
the configuration file is no longer required on the Gateway attachment
using the QDIO Interface. All the information is presented to the
Gateway device at initialization time through various tables and
commands which are passed over the channel interface.

A table is provided which maps all the Host images and Host Device
Addresses which will be using the QDIO Interface to the specific bits
defined in the SIGA vector. This list is derived directly from the
information defined in the IOCDS on the Host. Each entry in the IOCDS
which defines a QDIO device causes an entry to be placed in the initial
table. At initialization time, each entry in the table is assigned a
specific bit in the SIGA vector. Also, at any time after
initialization, this information can be dynamically changed and Host
Device Addresses can be added and/or deleted.

The Host Application which is to use the Host Device Address is defined
using a command called the MPC_ENABLE-IC Command. The Application
Specific Address is defined using the SETIP command. The Application
Specific Address can also be deleted using the DELIP command. The LAN
Port Number is specified in the STRTLAN Control Command. The Default
Routes are defined using the SETRTG Control Command. This is a new
control command defined specifically in the present invention. Setting
thresholds for priority traffic is defined using the
SETPRIORITYTHRESHOLD Control Command, which defines the maximum number
of bytes which can be processed for a specific QDIO Priority Queue
before checking for work on the other QDIO Priority Queues. This
command allows the user to tailor each individual system for its
specific application requirements.
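The effect of these commands on inbound routing can be sketched with a small in-memory model. The command names SETIP, DELIP and SETRTG come from the text; the class name, method signatures and parameters are hypothetical, since the command formats are not given here.

```python
class GatewayConfig:
    """Routing state built at run time from control commands (sketch)."""

    def __init__(self):
        self.address_to_device = {}  # Application Specific Address -> device
        self.default_route = {}      # Host Application -> default device

    def setip(self, address, device):
        """SETIP: define an Application Specific Address."""
        self.address_to_device[address] = device

    def delip(self, address):
        """DELIP: delete an Application Specific Address."""
        self.address_to_device.pop(address, None)

    def setrtg(self, application, device):
        """SETRTG: define the default route for a Host Application."""
        self.default_route[application] = device

    def route(self, address, application):
        """Pick the Host Device Address for an inbound packet.

        Falls back to the application's default route when the address
        has not been defined; returns None if neither exists.
        """
        if address in self.address_to_device:
            return self.address_to_device[address]
        return self.default_route.get(application)
```

Because the tables are plain run-time state, addresses can be added or deleted after initialization without a configuration file, which is the point of the dynamic configuration described above.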

Using this and the queue priority instructions the specific 
algorithm which is to be used when servicing each of the 
different priority queues is addressed. Each Host Device has 
the ability to set its own unique priority algorithm. 
SIGA Vector Implementation 

The SIGA Vector is needed to give initiative to the QDIO connected
Gateway device. One problem which is solved by the present invention is
the use of Priority Queues and how a priority algorithm can serve
multiple priority queues at the specified priority values. In other
words, certain queues represented by the SIGA Vector needed to be
completely serviced on each invocation because they were the highest
priorities. Each queue at the next lowest priority needed to have the
ability to have some of its traffic left pending if its thresholds for
service were reached. The higher priority queues then needed to be
rechecked to see if more work had come active while the lower priority
queues still had work pending.

To accomplish the above task, the SIGA Vector is split into a priority
bit mask. Each Device Address which is assigned to the QDIO interface
has one queue assigned for each of the possible priorities. In one
embodiment of the present invention, there are four bits assigned to
each of the different Device Addresses. When a certain priority work
request needs to be sent, the bit corresponding to the Device Address
and its corresponding priority is set. As requests come in from
different priorities or from different Device Addresses, their bits are
also set. This gives the Host System the ability to give multiple
different work requests in the same SIGA Mask.

Another problem addressed is the effective service of various QDIO
priorities when only a single bit is being used to signal work to the
Gateway device. Since it is possible that all the work for a certain
priority would not be serviced before checking back for more work for
the other priorities, the Gateway device needed to be able to remember
the current work, but be able to go back and look for more new work. To
do this, the Gateway device writes a specific value into the SIGA
Vector area after each read of the vector. Once the Host code detects
the value written by the Gateway device, the vector is completely
cleared and then new work requests are added. Clearing of the vector
after each read enables the fairness algorithms so the different
priorities can be processed at their desired rates.

One additional problem to be addressed is the number of bits which need
to be scanned to identify the work requests. In one embodiment of the
present invention, there is a possibility of 240 Device Addresses. Each
Device Address has 4 priorities, so this computes to 4*240 or 960
possible bit settings. The overhead of scanning all these bits to find
the work requests is too high. To make the searching faster, the 960
bits are split into 30 different 32-bit masks. When a new work request
is added, the bit in one of the 30 different 32-bit masks is set. Also,
the bit in the Work Vector which corresponds to the 32-bit mask in
which the bit was set is also set.
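The two-level lookup can be sketched as follows. The counts (240 Device Addresses, 4 priorities, 30 masks of 32 bits) come from the text; the exact bit layout (four consecutive bits per device, leftmost bit first within a mask, one Work Vector bit per mask) and all names are assumptions made for illustration.

```python
DEVICES = 240
PRIORITIES = 4
MASK_BITS = 32
MASK_COUNT = DEVICES * PRIORITIES // MASK_BITS  # 960 / 32 = 30


class SigaVector:
    """960 request bits held in 30 masks, indexed by a Work Vector."""

    def __init__(self):
        self.masks = [0] * MASK_COUNT
        self.work_vector = 0  # bit n set: masks[n] has at least one request

    def post(self, device, priority):
        """Host side: record a work request for (device, priority)."""
        bit = device * PRIORITIES + priority
        word, offset = divmod(bit, MASK_BITS)
        self.masks[word] |= 1 << (MASK_BITS - 1 - offset)
        self.work_vector |= 1 << word

    def pending(self):
        """Gateway side: scan only the masks flagged in the Work Vector."""
        found = []
        for word in range(MASK_COUNT):
            if not self.work_vector & (1 << word):
                continue  # skip masks with no requests
            mask = self.masks[word]
            for offset in range(MASK_BITS):
                if mask & (1 << (MASK_BITS - 1 - offset)):
                    found.append(divmod(word * MASK_BITS + offset, PRIORITIES))
        return found
```

With a single request outstanding, the gateway inspects 30 Work Vector bits and one 32-bit mask instead of all 960 bits.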

The Work Vector, which identifies the specific 32-bit mask, makes
finding the bits which are set much more efficient. The Gateway device
can now scan the Work Vector to find the appropriate 32-bit mask. The
Gateway device can then just fetch the proper 32-bit mask to find the
work request.

In one embodiment of the present invention, all high priority traffic
is handled completely and then the amount of data processed from the
other queues is assigned a weight using the SETPRIORITYTHRESHOLD
command. Once the lower priority queues have been handled, it is
possible some data could be residual in these queues. It then becomes
necessary to go back and check the priority queues to see if new
requests have arrived. To make sure only new requests have been added
to the list when it is refetched, each time the adapter reads the SIGA
Vector, it sets a field to indicate the vector has been read. The next
Host request will then see that the adapter has read the SIGA Vector.
It is then completely
cleared by the Host code before setting the new request.
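The read-marker handshake described above can be sketched as a small shared object. This is a single-threaded illustration only: a real implementation would need atomic updates between host and adapter, and the class and attribute names are hypothetical.

```python
class SharedSigaVector:
    """SIGA Vector with an 'adapter has read' field (sketch)."""

    def __init__(self):
        self.bits = 0             # pending work requests
        self.read_marker = False  # set by the adapter after each read

    def adapter_read(self):
        """Adapter side: take a snapshot and mark the vector as read."""
        snapshot = self.bits
        self.read_marker = True
        return snapshot

    def host_post(self, bit):
        """Host side: clear stale requests once the adapter has read them,
        then add the new request."""
        if self.read_marker:
            self.bits = 0
            self.read_marker = False
        self.bits |= 1 << bit
```

The adapter keeps its own record of the work it has already fetched, so the next read returns only requests posted since the previous read.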
Error Reporting During Run Time: Non Catastrophic

As data is being transferred across the QDIO interface to and from the
Gateway device, it is possible for errors to periodically occur in the
data stream. Intermittent errors can be recovered. Errors which become
persistent need to be detected so the interface can be taken down and
then restarted. All this needs to happen at run time and require no
user intervention.

To accomplish this, Error States are defined for the SLSB Status Block.
When the adapter detects errors in the data stream, an error state is
set in the SLSB. The specific reason for the error is stored in the
SBALF (SBAL Flags), which are located in the SBAL associated with the
SLSB entry that has the error state set. Using this approach, the Host
is able to monitor the number of errors which occur within a specified
time period. If the number of errors exceeds the pre-determined
threshold which has been set, the QDIO Connection is terminated. If the
error rate stays under the
specific threshold, the connection will remain active.
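The error-rate check can be sketched as a sliding-window counter. The text specifies only a threshold and a time period; the sliding-window interpretation, the class name and the use of caller-supplied timestamps are assumptions.

```python
from collections import deque


class ErrorMonitor:
    """Count SLSB error states seen within a time window (sketch)."""

    def __init__(self, threshold, window):
        self.threshold = threshold  # errors tolerated within the window
        self.window = window        # length of the window, in seconds
        self.times = deque()

    def record_error(self, now):
        """Record one error at time 'now'.

        Returns True when the connection should be terminated, that is,
        when the number of errors in the window exceeds the threshold.
        """
        self.times.append(now)
        while self.times and now - self.times[0] > self.window:
            self.times.popleft()  # forget errors older than the window
        return len(self.times) > self.threshold
```

Intermittent errors spaced further apart than the window never accumulate, so the connection stays active; a burst of errors trips the threshold and the interface can be taken down and restarted.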
Concurrent Patch 

Concurrent Patch is a feature provided in QDIO. The Concurrent Patch
feature allows the customer to install a new level of microcode on the
adapter without interrupting any of the applications and/or services
using the adapter. For Channel adapters this was not a major problem
because none of the applications using the channel adapter required any
connection-type information to be kept across the code update.

For the Network Adapters which are using TCP/IP, the adapter contains
information about each client station in the LAN and each connection
which is present with the Host Applications. The connections are active
once the adapter is activated and remain present while the card is
active. There are no Gateway platforms today which will keep the TCP/IP
sessions active during a code update. The QDIO Hydranet adapter is the
first to offer the Concurrent Patch feature in a Gateway environment.
QDIO in Virtual-Machine Environment 

The key control mechanism for QDIO is the storage-list-state block
(SLSB), comprising a vector of state entries for each queue, with one
entry per storage-block-address list (SBAL). An SBAL contains the
addresses of a set of storage blocks within main memory, the collection
of which is termed a buffer, either input or output.

Each SLSB entry represents a finite-state machine (FSM), an automaton
well known in the art, defining the states of a computing process, the
inputs and outputs of the process for each state, and the allowed
transitions among the states. Whereas a standard FSM is executed by a
single process, the FSM in an SLSB entry in this invention is shared
and used as a control and communication mechanism by a host program on
the one hand and by an I/O adapter on the other. The FSM is used by
each to drive the other. The set of states of the FSM is strictly
divided into two subsets, program-owned states and adapter-owned
states. The ownership is indicated by bits within the encodings of the
state-values. Each side exchanges ownership with the other to cause
control and processing to pass between them.

Thus, the FSM of an SLSB entry embodies two sets (one 
each in the program and the adapter) of one or more 
processes under the control of the FSM definition. These sets 
of processes are kept separate and carefully controlled 
through the two distinct subsets of FSM states, implying 
ownership by one side or the other, as described above. 
However, within either side (program or adapter), multiple 
processes may share and be controlled by the FSM. Such 
sharing processes within a given side may use the state-values
within its own side's ownership subset to control and
communicate with other processes on its own side, but not 
the other side. That is, neither side is permitted to understand 
or act upon the meaning of a specific state-value that is 
owned by the opposite side, other than to transfer ownership 
according to the FSM definition. This strict separation of the 
program and the adapter within the FSM ensures that each 
side can be a free-running process (or set of processes) 
through the entire set of FSMs in an SLSB without the 
possibility of deadlock. 

Within the preferred implementation, there are separate 
FSM definitions for input and output queues. The five FSM 
states for input queues are as follows: 

* input buffer not initialized (program owned) 

* input buffer empty (adapter owned) 

* input buffer primed (program owned) 

* input buffer error (program owned) 

* input buffer halted (program owned) 

The five FSM states for output queues in the preferred 
implementation are as follows: 

* output buffer not initialized (program owned) 

* output buffer empty (program owned) 

* output buffer primed (adapter owned) 

* output buffer error (program owned) 

* output buffer halted (program owned) 
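
The two state lists above can be written down as a small table. The following Python sketch is illustrative only: the numeric encodings and the names `OWNER_PROGRAM`, `OWNER_ADAPTER`, and `owner_of` are assumptions, since the text says only that ownership is indicated by bits within the state-value encoding.

```python
from enum import IntEnum

# Hypothetical encoding: two high bits name the owner, the low bits name the
# processing state. The actual bit layout is not given in this passage.
OWNER_PROGRAM = 0x80
OWNER_ADAPTER = 0x40


class InputState(IntEnum):
    NOT_INITIALIZED = OWNER_PROGRAM | 0x01
    EMPTY = OWNER_ADAPTER | 0x02      # adapter may fill the buffer
    PRIMED = OWNER_PROGRAM | 0x03     # program may consume the data
    ERROR = OWNER_PROGRAM | 0x04
    HALTED = OWNER_PROGRAM | 0x05


class OutputState(IntEnum):
    NOT_INITIALIZED = OWNER_PROGRAM | 0x01
    EMPTY = OWNER_PROGRAM | 0x02      # program may fill the buffer
    PRIMED = OWNER_ADAPTER | 0x03     # adapter may transmit the data
    ERROR = OWNER_PROGRAM | 0x04
    HALTED = OWNER_PROGRAM | 0x05


def owner_of(state):
    """Return 'program' or 'adapter' by inspecting only the ownership bits."""
    return "adapter" if state & OWNER_ADAPTER else "program"
```

In each queue type exactly one state is adapter-owned, so a side can decide whether it may touch a buffer by testing the ownership bits alone, without interpreting the other side's state-values.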

FIGS. 7 and 8 depict sample input and output queues, as
discussed below.
With the FSM in each SLSB entry being executed coopera- 
tively but independently by the program and the adapter, the 
processing of an entire input or output queue is accom- 
plished by sequentially cycling through the full set of FSMs 
(and, hence, buffers) within the SLSB controlling the queue. 

The following control mechanism is an abstract, simplified
version of the preferred implementation for the proper
sequencing through the buffers.
Output Queues:

Program

Current_Entry=1;

LOOP: DO WHILE Current_State¬=PRIMED AND output data exists;
   Execute FSM for Current_Entry;
   Current_Entry=Current_Entry+1 modulo SLSB_Size;
END;

WAIT (for more data from application or Current_State change);
GO TO LOOP;

Adapter






Current_Entry=1;

LOOP: DO WHILE Current_State=PRIMED;
   Execute FSM for Current_Entry;
   Current_Entry=Current_Entry+1 modulo SLSB_Size;
END;

WAIT (for SIGA-w signal);
GO TO LOOP;

Input Queues:

Program

Current_Entry=1;

LOOP: DO WHILE Current_State¬=EMPTY;
   Execute FSM for Current_Entry;
   Current_Entry=Current_Entry+1 modulo SLSB_Size;
END;

WAIT (for PCI or timer interruption);
GO TO LOOP;

Adapter

Current_Entry=1;

LOOP: DO WHILE Current_State=EMPTY AND input data exists;
   Execute FSM for Current_Entry;
   Current_Entry=Current_Entry+1 modulo SLSB_Size;
END;

WAIT (for more data, SIGA-r signal, or Current_State change);
GO TO LOOP;

These control mechanisms (i.e., the FSMs and the 
sequencing logic to loop through the FSMs in an SLSB) 
keep the program and the adapter in synchronism with each
other without deadlock as the cooperating processes on each
side move in tandem through different portions of the SLSB.
The invariant conditions are that each side always processes
FSM states not processed by the other, and as data is moved
inbound or outbound, each side sets FSM states processed
by the other. As long as one side is running, it sets states that
will be processed by the other side, and vice versa. 
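
The output-queue sequencing described above can be exercised with a short single-threaded simulation. This is a minimal sketch under stated assumptions, not the preferred implementation: the class `OutputQueue` and its method names are hypothetical, entries are 0-based, only the EMPTY and PRIMED states are modeled (error and halted handling is omitted), and "Execute FSM" is reduced to filling or draining one buffer.

```python
EMPTY, PRIMED = "EMPTY", "PRIMED"   # the only output-queue states modeled here


class OutputQueue:
    def __init__(self, size):
        self.slsb = [EMPTY] * size      # one FSM state per buffer
        self.buffers = [None] * size
        self.prog_entry = 0             # program's Current_Entry (0-based)
        self.adap_entry = 0             # adapter's Current_Entry (0-based)
        self.sent = []                  # stands in for the network

    def program_run(self, pending):
        """Program loop: fill program-owned buffers while output data exists."""
        while self.slsb[self.prog_entry] != PRIMED and pending:
            self.buffers[self.prog_entry] = pending.pop(0)
            self.slsb[self.prog_entry] = PRIMED      # hand buffer to adapter
            self.prog_entry = (self.prog_entry + 1) % len(self.slsb)
        # Falling out of the loop corresponds to WAIT in the pseudocode.

    def adapter_run(self):
        """Adapter loop: drain adapter-owned (PRIMED) buffers in order."""
        while self.slsb[self.adap_entry] == PRIMED:
            self.sent.append(self.buffers[self.adap_entry])
            self.slsb[self.adap_entry] = EMPTY       # hand buffer back
            self.adap_entry = (self.adap_entry + 1) % len(self.slsb)
```

A usage example: with `q = OutputQueue(4)` and `pending = list(range(10))`, alternately calling `q.program_run(pending)` and `q.adapter_run()` until `pending` is empty and no entry is PRIMED leaves `q.sent` holding the ten items in their original order. Each side only ever tests for its own states and only ever sets a state owned by the other side, which is the invariant stated above.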

The QDIO protocol so far described is extended in the
present invention to be used in a virtual-machine
environment through minor additions along with careful design and
attention to the following considerations.

A key aspect of QDIO is the shared-memory model by 
which the program and the adapter share a common queue 
structure and data areas in a computer's main memory. With 
the free-running cooperative processes described above,
controlled by a set of FSMs in an SLSB for each data queue,
the use of shared memory avoids the processor and
channel-subsystem overhead of start-processing and one-for-one
interruptions associated with traditional input/output
operations.

Such a shared-memory model is problematic in the envi- 
ronment of a virtual machine, which is an image of a real 
machine created by a program called a virtual-machine 
hypervisor. The apparent real storage of the virtual machine 
is in fact pageable storage of the hypervisor. The adapter,
lacking dynamic-address-translation (DAT) capability and 
the hypervisor's associated DAT tables, needs to know the 
actual real-storage addresses of the queue structures and 
data. 

The shared-memory model of the QDIO protocol is
simulated by the virtual-machine hypervisor through the use
of "shadow" copies of key control blocks that are
maintained by the hypervisor. The QDIO control-block structure
is designed in such a way as to carefully separate and isolate 
main-memory addresses from non-address information. 

Among the QDIO control blocks, the storage list (SL) and
storage-block-address list (SBAL) are designed specifically 
to contain addresses needed by the adapter. The queue- 
information block (QIB) and the storage-list-information 
block (SLIB) are designed specifically to exclude any such 
addresses. The memory pages containing the QIB and the 
SLIB are fixed in main memory by the hypervisor and, thus, 
follow the QDIO shared-memory model: the program 
accesses the QIB and the SLIB using addresses that are in
fact virtual, while the adapter accesses these same control
blocks with real addresses. 

The SLs and SBALs are shadowed by the hypervisor. The 
SLSB is also shadowed, although it contains no addresses, 
because of its definition as the controlling mechanism for the 
program's and the adapter's cooperating processes. The 
changing of FSM states in the SLSB controls the program's 
and the adapter's access to the other queue components that 
require address translation, and hence, FSM state-changes 
must be gated and controlled by the hypervisor using the 
shadow-block mechanism. 

The QDIO protocol is started by the existing START 
SUBCHANNEL (SSCH) machine instruction in the pre- 
ferred implementation, but could be started by one or more 
new instructions defined for the purpose. For pageable 
virtual machines, SSCH is intercepted by the hypervisor so 
as to begin the simulation of the QDIO protocol. During the
simulation of the Establish-QDIO-Queues channel
command, the hypervisor builds shadow copies of the SL,
SBAL, and SLSB control blocks. The queue-descriptor
record (QDR) associated with the Establish-QDIO-Queues
command contains the main-memory addresses of the QDIO 
queue components as seen by the program. The hypervisor 
translates those addresses, as well as addresses within the SL 
and SBALs, in building its own copy of the QDR and the 
shadow SL and SBALs. Translation of addresses within the 
SBALs may be delayed until the simulation of the Activate- 
QDIO-Queues channel command if the program chooses to 
defer its data-buffer assignments until the queues are acti- 
vated. 
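
The shadow-building step can be pictured with a small sketch. Everything named here is an assumption made for illustration: the helper names `translate` and `build_shadows`, the 4096-byte page size, and the representation of the hypervisor's translation tables as a dictionary from guest page number to real page number. It shows only the idea that addresses in the SL and SBALs are translated while the SLSB, which holds no addresses, is copied unchanged.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def translate(page_table, guest_addr):
    """Map a guest (virtual-machine) address to a real-storage address."""
    page, offset = divmod(guest_addr, PAGE_SIZE)
    # A KeyError here means the page is not mapped in this toy table.
    return page_table[page] * PAGE_SIZE + offset


def build_shadows(page_table, sl, sbals, slsb):
    """Build hypothetical shadow copies of the SL, SBALs, and SLSB.

    sl    : list of guest addresses, one per SBAL
    sbals : dict mapping SBAL guest address -> list of guest block addresses
    slsb  : list of state values (contains no addresses)
    """
    shadow_sl = [translate(page_table, a) for a in sl]
    shadow_sbals = {
        translate(page_table, a): [translate(page_table, b) for b in sbals[a]]
        for a in sl
    }
    shadow_slsb = list(slsb)   # copied as-is; nothing to translate
    return shadow_sl, shadow_sbals, shadow_slsb
```

Deferring SBAL translation until the queues are activated, as the text allows, would correspond to calling the inner translation later, once the program has assigned its data buffers.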

Once the QDIO protocol is started, the virtual-machine 
hypervisor needs to intervene to perform address translation 
whenever the program presents a new set of empty or full 
buffers to the adapter for inbound or outbound data, respec- 
tively. The hypervisor also intervenes when synchronization 
is needed between the program's original SLSB and the 
hypervisor's shadow SLSB used by the adapter. Such
address translation and SLSB synchronization are implicit
during the hypervisor's interception of program-controlled
interruptions (PCIs) and the SIGA-w and SIGA-r
instructions. The SIGA-s instruction causes hypervisor intervention
when there is no signal needed between the program and the 
adapter in the non-virtual-machine environment, but there is 
nevertheless a need for address translation and SLSB syn- 
chronization for the virtual machine. In the preferred 
implementation, SIGA-s is used by the program when 
recovering emptied outbound buffers from the adapter and 
after a program timer expires to signal the need for checking 
of SLSB states on input queues. 

The previously-described FSM definitions and sequenc- 
ing protocols for the SLSB support and make possible the 
operation of QDIO in virtual machines. The concept of 
ownership of SBALs and data buffers, as embodied in the 
separate program-owned and adapter-owned states of the 
FSMs, means that the adapter never accesses main memory
for which the adapter does not have ownership within the
applicable FSM. Ownership is only transferred from
program to adapter by the setting of an adapter-owned state in
the FSM by the program and the subsequent synchronization
of the program's FSM with the adapter's shadow FSM by
the hypervisor, after the hypervisor performs the necessary 
address translation. Likewise, ownership is only transferred 
from adapter to program by the setting of a program-owned 
state in the FSM by the adapter and the subsequent syn- 
chronization of the real and shadow FSMs, after the hyper- 
visor updates the applicable real SBALs from the shadow 
SBALs with, for example, the actual data count moved 
through the adapter. 

The mutually-exclusive FSM-state subsets between the
program and the adapter, with the rule of each side setting
ownership by the other side to transfer processing between
them, enable straightforward synchronization of the real
and shadow SLSBs by the hypervisor. The hypervisor
maintains a "hidden shadow" copy of the SLSB to reflect the state
of the SLSB at the previous synchronization point. This
permits easy recognition of changes made by the program to
the real SLSB and by the adapter to the shadow SLSB, 
allowing the proper updates in each direction between the 
real and shadow SLSBs with one pass through the three 
copies of the SLSB at each synchronization point. 
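
The one-pass, three-copy synchronization can be sketched as follows. This is an illustrative reading of the paragraph above, not the hypervisor's actual code: the function name `synchronize` and the list representation of the SLSB copies are assumptions, and the address translation and SBAL updates that the text says accompany each transfer are indicated only by comments.

```python
def synchronize(real, shadow, hidden):
    """One pass over the program's real SLSB, the adapter's shadow SLSB,
    and the hidden shadow that records the previous synchronization point.

    All three lists are updated in place. Because each entry is owned by
    exactly one side between synchronization points, at most one of
    real[i] and shadow[i] can differ from hidden[i].
    """
    for i in range(len(hidden)):
        if real[i] != hidden[i]:
            # The program changed this entry. Address translation for the
            # affected SBAL would happen here, before the adapter sees it.
            shadow[i] = real[i]
        elif shadow[i] != hidden[i]:
            # The adapter changed this entry. The real SBAL would be updated
            # from the shadow SBAL here, before the program sees it.
            real[i] = shadow[i]
        hidden[i] = real[i]   # record the new agreed state
```

For example, on an output queue whose previous state was `["PRIMED", "EMPTY", "EMPTY"]`, if the program has since primed entry 1 in the real SLSB and the adapter has emptied entry 0 in the shadow SLSB, a single call leaves all three copies equal to `["EMPTY", "PRIMED", "EMPTY"]`.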

The mutually-exclusive FSM-state subsets and the
sequencing rules through the SLSB entries further support
virtualization by ensuring that synchronization by the
hypervisor does not disrupt or interfere with concurrent operations
by the program and the adapter on their respective copies of
the SLSB. The boundaries between program-owned and
adapter-owned states constantly move downward through
the SLSB and back to the top. Neither side looks beyond its
own contiguous set(s) of owned FSMs, with the boundaries
being apparent. This means the method of synchronization
by the hypervisor, whether top-down, bottom-up, or
middle-to-middle in either direction, can have no lasting effect of
disrupting the program's or the adapter's operation.

While the invention has been described in detail herein in 
accordance with certain preferred embodiments thereof, 
many modifications and changes therein may be effected by 
those skilled in the art. Accordingly, it is intended by the
appended claims to cover all such modifications and changes 
as fall within the true spirit and scope of the invention. 

What is claimed is: 

1. In a network environment having a main storage, a 
queuing mechanism apparatus established in said main
storage for receipt and transfer of incoming and outgoing
data comprising:

at least one set of dedicated input queues;

at least one set of dedicated output queues;

a plurality of queuing components providing attributes of
devices to and from which data is to be transferred or
received as well as information about the queue
mechanism itself;

said input and output queues having an information block
containing address of all input and output queues;

said input and output queues also having a storage
information block providing information about said queue
mechanism, said block providing storage list information
blocks defined for each queue which contains
specific information about that queue;

said input and output queue sets further having storage
lists for identifying any input-output buffer(s)
associated with each queue-set;

said input and output queue sets having a storage block
address list for providing information about storage
locations of any input-output buffer.



2. The apparatus of claim 1, wherein said queue compo- 
nents collectively describe the queues characteristics and 
provide the necessary controls to allow exchange of data 
between a running program and an interface element in 
processing communication with said main storage and 
capable of connecting to one or more input or output 
devices. 

3. The apparatus of claim 1, wherein said queuing com- 
ponent further comprise a Queuing status block reflecting 
any changes dynamically as per changing I/O activity status. 

4. The apparatus of claim 1, wherein said queues comprise 
buffers which reflect channel ownership. 

5. The apparatus of claim 1, wherein separate images or 
logical partitions are provided for one or more virtual 
systems in said environment and each image is assigned a 
separate queue in the queuing mechanism. 

6. The apparatus of claim 1, wherein said queue compo- 
nent further comprise an Information block (QIB) providing 
information about the collection of input and output queues 
associated with a given subchannel existing in said envi- 
ronment. 

7. The apparatus of claim 6, wherein said QIB also 
provides information for collection of input and output 
queues associated with said subchannel. 

8. The apparatus of claim 7, wherein at least one QIB is 
defined per each subchannel. 

9. The apparatus of claim 1, wherein said queuing com- 
ponent further comprises a Storage List Information Block 
(SLIB) which provides address of information stored
pertaining to each queue.

10. The apparatus of claim 9, wherein at least one SLIB 
is defined for each queue. 

11. The apparatus of claim 10, wherein said SLIB also 
contains information about a queue and has a header and 
entries called storage-list-information-block (SLIB) entries 
containing information about each of the buffers for each 
queue. 

12. The apparatus of claim 11, wherein a storage-list- 
information-block-element (SLIBE) is provided having 
information regarding data buffers as determined by a cor- 
responding Storage List entry. 

13. The apparatus of claim 11, wherein said queuing 
component comprises a Storage List (SL) used to defines the 
SBAL or storage block address lists that are defined for each 
I/O buffers associated with each Queue. 

14. The apparatus of claim 13, wherein one SL is defined 
for each queues which contains an entry for each QDIO-I/O 
buffers associated with said queue. 

15. The apparatus of claim 13, wherein said SL provides 
information about the I/O buffer locations in said main 
storage. 

16. The apparatus of claim 14, wherein said SL also 
provides absolute storage address of a storage-block- 
address-list (SBAL). 

17. The apparatus of claim 16, wherein each SBAL 
contains a list of absolute addresses of the storage blocks 
that collectively make up one of a plurality of data buffers 
associated with each queue. 

18. The apparatus of claim 17, wherein each SBAL also 
comprises of a storage Block address list entry (SBALE) 
containing absolute storage address of a storage block. 

19. The apparatus of claim 18, wherein collectively, said 
storage blocks addressed by all entries of a single SBAL 
constitute one of many possible buffers of a said queuing. 

20. The apparatus of claim 19, wherein the number of said 
buffers equal 128. 

21. The apparatus of claim 20, wherein each SBAL also 
comprises of flags or SBALF containing information about 
overall buffers associated with said SBAL containing in each 
SBALE.

22. The apparatus of claim 21, wherein description of
contents of each SBALF field is different for each SBALE 
within said SBAL. 

23. The apparatus of claim 1, wherein said queuing 
components further comprises a Storage-List-State Block
(SLSB) providing state indicators that provide state infor- 
mation about a plurality of buffers that make up a queue. 

24. The apparatus of claim 23, wherein each of said 
buffers further comprises a collection of storage blocks that 
can be located using all addresses in a single
storage-block-address list.

25. The apparatus of claim 22, wherein current state value 
in an SLSB entry can be changed by storing a new value in 
buffer entry. 

26. The apparatus of claim 23, wherein said SLSB further
comprises a state of queues buffer N (SQBN) providing
current state of a corresponding buffer. 

27. The apparatus of claim 26, wherein said buffer that 
corresponds to a given SLSB entry is determined by a 
storage list entry having same sequential position in a
storage list as the SQBN field has in said SLSB. 

28. The apparatus of claim 27, wherein said state value 
comprises of two parts. 

29. The apparatus of claim 28, wherein a first part 
indicates whether a buffer is owned by a running program or
a control unit and whether a buffer is an input or output 
buffer. 

30. The apparatus of claim 28, wherein a second part 
indicates current processing state of a buffer. 

31. The apparatus of claim 30, wherein certain bits can
also be identified to mean different configurations. 

32. The apparatus of claim 31, wherein one bit can be 
established to indicates program ownerships. 

33. The apparatus of claim 32, wherein one or more bits 
can be established control unit ownership and buffer type
respectively. 

34. The apparatus of claim 32, wherein at least one bit can
be established to contain a value indicating current
processing state of any associated buffer.

35. The apparatus of claim 34, wherein said state of
associated buffer can be empty and available for data 
storage, primed and available to be processed, not initialized 
and not available for use, halted and containing valid data 
when data transfer was prematurely halted by a program 
executing Halt Subchannel, or Error and associating buffer 
in an error state when contents of said buffer are not 
meaningful. 

36. The apparatus of claim 1, wherein said queuing 
component further comprises a Storage Blocks (SB) defin- 
ing a single I/O buffer. 

37. In a network environment having a main storage, a 
method for transfer of incoming and outgoing data by 
establishing a queuing mechanism in said main storage 
comprising the steps of: 

dedicating at least one set of queues for input data; 

dedicating at least one set of queues for output data; 

establishing a plurality of queuing components in each 
queue set to providing attributes of devices to and from 
which data is to be transferred or received as well as 
information about the queue mechanism itself; 

establishing in said input and output queue sets an infor- 
mation block containing address of all input and output 
queues; 

establishing in said input and output queue sets a storage 
information block providing information about said 
queue mechanism, said block providing storage list 
information blocks defined for each queue which con- 
tains specific information about that queue; 

establishing in said input and output queue sets a storage 
list for identifying any input-output buffer(s) associated
with each queue-set; 

establishing in said input and output queue sets a storage 
block address list for providing information about 
storage locations of any input-output buffer. 